<!DOCTYPE html>
<html>
<head>
  <!-- hexo-inject:begin --><!-- hexo-inject:end --><meta charset="utf-8">
  
  <title>CRF Layer on the Top of BiLSTM - 3 | CreateMoMo</title>
  <meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1">
  <meta name="description" content="2.3 CRF loss functionThe CRF loss function is consist of the real path score and the total score of all the possible paths. The real path should have the highest score among those of all the possible">
<meta property="og:type" content="article">
<meta property="og:title" content="CRF Layer on the Top of BiLSTM - 3">
<meta property="og:url" content="http://createmomo.github.io/2017/10/08/CRF-Layer-on-the-Top-of-BiLSTM-3/index.html">
<meta property="og:site_name" content="CreateMoMo">
<meta property="og:description" content="2.3 CRF loss functionThe CRF loss function is consist of the real path score and the total score of all the possible paths. The real path should have the highest score among those of all the possible">
<meta property="og:locale" content="default">
<meta property="og:updated_time" content="2017-10-12T21:51:39.084Z">
<meta name="twitter:card" content="summary">
<meta name="twitter:title" content="CRF Layer on the Top of BiLSTM - 3">
<meta name="twitter:description" content="2.3 CRF loss functionThe CRF loss function is consist of the real path score and the total score of all the possible paths. The real path should have the highest score among those of all the possible">
  
  
    <link rel="icon" href="/favicon.png">
  
  
    <link href="//fonts.googleapis.com/css?family=Source+Code+Pro" rel="stylesheet" type="text/css">
  
  <link rel="stylesheet" href="/css/style.css"><!-- hexo-inject:begin --><!-- hexo-inject:end -->
  

</head>

<body>
  <!-- hexo-inject:begin --><!-- hexo-inject:end --><div id="container">
    <div id="wrap">
      <header id="header">
  <div id="banner"></div>
  <div id="header-outer" class="outer">
    <div id="header-title" class="inner">
      <h1 id="logo-wrap">
        <a href="/" id="logo">CreateMoMo</a>
      </h1>
      
    </div>
    <div id="header-inner" class="inner">
      <nav id="main-nav">
        <a id="main-nav-toggle" class="nav-icon"></a>
        
          <a class="main-nav-link" href="/">Home</a>
        
          <a class="main-nav-link" href="/archives">Archives</a>
        
      </nav>
      <nav id="sub-nav">
        
        <a id="nav-search-btn" class="nav-icon" title="Search"></a>
      </nav>
      <div id="search-form-wrap">
        <form action="//google.com/search" method="get" accept-charset="UTF-8" class="search-form"><input type="search" name="q" class="search-form-input" placeholder="Search"><button type="submit" class="search-form-submit">&#xF002;</button><input type="hidden" name="sitesearch" value="http://createmomo.github.io"></form>
      </div>
    </div>
  </div>
</header>
      <div class="outer">
        <section id="main"><article id="post-CRF-Layer-on-the-Top-of-BiLSTM-3" class="article article-type-post" itemscope itemprop="blogPost">
  <div class="article-meta">
    <a href="/2017/10/08/CRF-Layer-on-the-Top-of-BiLSTM-3/" class="article-date">
  <time datetime="2017-10-08T00:05:31.000Z" itemprop="datePublished">2017-10-08</time>
</a>
    
  </div>
  <div class="article-inner">
    
    
      <header class="article-header">
        
  
    <h1 class="article-title" itemprop="name">
      CRF Layer on the Top of BiLSTM - 3
    </h1>
  

      </header>
    
    <div class="article-entry" itemprop="articleBody">
      
        <h4 id="2-3-CRF-loss-function"><a href="#2-3-CRF-loss-function" class="headerlink" title="2.3 CRF loss function"></a>2.3 CRF loss function</h4><p>The CRF loss function is consist of the real path score and the total score of all the possible paths. The real path should have the highest score among those of all the possible paths.</p>
<p>For example, if we have these labels in our dataset as shown in the table:</p>
<div class="table-container">
<table>
<thead>
<tr>
<th style="text-align:left">Label</th>
<th style="text-align:right">Index</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:left">B-Person</td>
<td style="text-align:right">0</td>
</tr>
<tr>
<td style="text-align:left">I-Person</td>
<td style="text-align:right">1</td>
</tr>
<tr>
<td style="text-align:left">B-Organization</td>
<td style="text-align:right">2</td>
</tr>
<tr>
<td style="text-align:left">I-Organization</td>
<td style="text-align:right">3</td>
</tr>
<tr>
<td style="text-align:left">O</td>
<td style="text-align:right">4</td>
</tr>
<tr>
<td style="text-align:left">START</td>
<td style="text-align:right">5</td>
</tr>
<tr>
<td style="text-align:left">END</td>
<td style="text-align:right">6</td>
</tr>
</tbody>
</table>
</div>
<p>We also have a sentence which has 5 words. The possible paths could be:</p>
<ul>
<li>1) START B-Person B-Person B-Person B-Person B-Person END</li>
<li>2) START B-Person I-Person B-Person B-Person B-Person END</li>
<li>…</li>
<li><strong>10) START B-Person I-Person O B-Organization O END</strong></li>
<li>…</li>
<li>N) O O O O O O O<a id="more"></a>
</li>
</ul>
<p>Suppose every possible path has a score $ P_{i} $ and there are totally $ N $ possible paths, the total score of all the paths is $ P_{total} = P_1 + P_2 + … + P_N = e^{S_1} + e^{S_2} + … + e^{S_N} $, $ e $ is the mathematical constant $ e $. (In section 2.4, we will explain how to calculate $ S_i $, you can also treat it as a score of this path.) </p>
<p>If we say the 10th path is the real path, in other words, the 10th path is the gold standard labels provided by our training dataset. The score $ P_{10} $ should be the one with the largest percentage in all the possible paths.</p>
<p>Given the equation which is also <strong>the loss function</strong> below, <strong>during the training process, the parameter values of our BiLSTM-CRF model will be updated again and again to keep increasing the percentage of the score of the real path</strong>.</p>
<p>$ Loss Function = \frac{P_{RealPath}}{P_1 + P_2 + … + P_N}  $</p>
<p>Now, the questions are:<br>1) How to define the score of a path?<br>2) How to calculate the total score of all possible paths?<br>3) When we calculate the total score, do we have to list all the possible paths? (The answer to this question is NO. )</p>
<p>In the following sections, we will see how to solve these questions.</p>
<h3 id="Next"><a href="#Next" class="headerlink" title="Next"></a>Next</h3><h4 id="2-4-Real-path-score"><a href="#2-4-Real-path-score" class="headerlink" title="2.4 Real path score"></a>2.4 Real path score</h4><p>How to calculate the score of the true labels of a sentence. </p>
<h4 id="2-5-The-score-of-all-the-possible-paths"><a href="#2-5-The-score-of-all-the-possible-paths" class="headerlink" title="2.5 The score of all the possible paths"></a>2.5 The score of all the possible paths</h4><p>How to calculate the total score of all the possible paths of a sentence with a step-by-step toy example.</p>
<p><strong>(Sorry for my late update, I will try my best to squeeze time for updating the following sections.)</strong></p>
<h2 id="References"><a href="#References" class="headerlink" title="References"></a>References</h2><p>[1]  Lample, G., Ballesteros, M., Subramanian, S., Kawakami, K. and Dyer, C., 2016. Neural architectures for named entity recognition. arXiv preprint arXiv:1603.01360.<br><a href="https://arxiv.org/abs/1603.01360" target="_blank" rel="external">https://arxiv.org/abs/1603.01360</a></p>
<blockquote>
<p>When you reprint or distribute this article, please include the original link address.</p>
</blockquote>

      
    </div>
    <footer class="article-footer">
      <a data-url="http://createmomo.github.io/2017/10/08/CRF-Layer-on-the-Top-of-BiLSTM-3/" data-id="ck0lc6yza0003ucp07dmsiv9l" class="article-share-link">Share</a>
      
        <a href="http://createmomo.github.io/2017/10/08/CRF-Layer-on-the-Top-of-BiLSTM-3/#disqus_thread" class="article-comment-link">Comments</a>
      
      
    </footer>
  </div>
  
    
<nav id="article-nav">
  
    <a href="/2017/10/17/CRF-Layer-on-the-Top-of-BiLSTM-4/" id="article-nav-newer" class="article-nav-link-wrap">
      <strong class="article-nav-caption">Newer</strong>
      <div class="article-nav-title">
        
          CRF Layer on the Top of BiLSTM - 4
        
      </div>
    </a>
  
  
    <a href="/2017/09/23/CRF_Layer_on_the_Top_of_BiLSTM_2/" id="article-nav-older" class="article-nav-link-wrap">
      <strong class="article-nav-caption">Older</strong>
      <div class="article-nav-title">CRF Layer on the Top of BiLSTM - 2</div>
    </a>
  
</nav>

  
</article>


<section id="comments">
  <div id="disqus_thread">
    <noscript>Please enable JavaScript to view the <a href="//disqus.com/?ref_noscript">comments powered by Disqus.</a></noscript>
  </div>
</section>
</section>
        
          <aside id="sidebar">
  
    

  
    

  
    
  
    
  <div class="widget-wrap">
    <h3 class="widget-title">Archives</h3>
    <div class="widget">
      <ul class="archive-list"><li class="archive-list-item"><a class="archive-list-link" href="/archives/2019/07/">July 2019</a></li><li class="archive-list-item"><a class="archive-list-link" href="/archives/2019/01/">January 2019</a></li><li class="archive-list-item"><a class="archive-list-link" href="/archives/2018/01/">January 2018</a></li><li class="archive-list-item"><a class="archive-list-link" href="/archives/2017/12/">December 2017</a></li><li class="archive-list-item"><a class="archive-list-link" href="/archives/2017/11/">November 2017</a></li><li class="archive-list-item"><a class="archive-list-link" href="/archives/2017/10/">October 2017</a></li><li class="archive-list-item"><a class="archive-list-link" href="/archives/2017/09/">September 2017</a></li></ul>
    </div>
  </div>


  
    
  <div class="widget-wrap">
    <h3 class="widget-title">Recent Posts</h3>
    <div class="widget">
      <ul>
        
          <li>
            <a href="/2019/07/18/Table-of-Contents/">Table of Contents</a>
          </li>
        
          <li>
            <a href="/2019/01/07/Probabilistic-Graphical-Models-Revision-Notes/">Probabilistic Graphical Models Revision Notes</a>
          </li>
        
          <li>
            <a href="/2018/01/23/Super-Machine-Learning-Revision-Notes/">Super Machine Learning Revision Notes</a>
          </li>
        
          <li>
            <a href="/2018/01/17/My-Life/">My Life</a>
          </li>
        
          <li>
            <a href="/2017/12/07/CRF-Layer-on-the-Top-of-BiLSTM-8/">CRF Layer on the Top of BiLSTM - 8</a>
          </li>
        
      </ul>
    </div>
  </div>

  
</aside>
        
      </div>
      <footer id="footer">
  
  <div class="outer">
    <div id="footer-info" class="inner">
      &copy; 2019 CreateMoMo<br>
      Powered by <a href="http://hexo.io/" target="_blank">Hexo</a>
    </div>
  </div>
</footer>
    </div>
    <nav id="mobile-nav">
  
    <a href="/" class="mobile-nav-link">Home</a>
  
    <a href="/archives" class="mobile-nav-link">Archives</a>
  
</nav>
    
<script>
  var disqus_shortname = 'createmomo';
  
  var disqus_url = 'http://createmomo.github.io/2017/10/08/CRF-Layer-on-the-Top-of-BiLSTM-3/';
  
  (function(){
    var dsq = document.createElement('script');
    dsq.type = 'text/javascript';
    dsq.async = true;
    dsq.src = '//' + disqus_shortname + '.disqus.com/embed.js';
    (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
  })();
</script>


<script src="//ajax.googleapis.com/ajax/libs/jquery/2.0.3/jquery.min.js"></script>


  <link rel="stylesheet" href="/fancybox/jquery.fancybox.css">
  <script src="/fancybox/jquery.fancybox.pack.js"></script>


<script src="/js/script.js"></script>

  </div>
<script type="text/x-mathjax-config">
    MathJax.Hub.Config({
        tex2jax: {
            inlineMath: [ ["$","$"], ["\\(","\\)"] ],
            skipTags: ['script', 'noscript', 'style', 'textarea', 'pre', 'code'],
            processEscapes: true
        }
    });
    MathJax.Hub.Queue(function() {
        var all = MathJax.Hub.getAllJax();
        for (var i = 0; i < all.length; ++i)
            all[i].SourceElement().parentNode.className += ' has-jax';
    });
</script>
<!-- <script src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>-->
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.1/MathJax.js?config=TeX-MML-AM_CHTML"></script><!-- hexo-inject:begin --><!-- hexo-inject:end -->
</body>
</html>